145 research outputs found

    Differentially profiling the low-expression transcriptomes of human hepatoma using a novel SSH/microarray approach

    BACKGROUND: The main limitation of genome-wide gene-expression profiling is the assay of low-expression genes. High-throughput, high-sensitivity approaches for assaying low-expression transcripts are urgently needed for functional genomic studies. Combining suppressive subtractive hybridization (SSH) with cDNA microarray techniques, using the subtracted cDNA clones as probes printed on chips, has greatly improved the efficiency of fishing out differentially expressed clones and has been used before. However, identifying the genes still requires tedious and inefficient sequencing work because of the great redundancy in the subtracted amplicons, and the approach sacrifices the original advantage of SSH, namely its high sensitivity in profiling low-expression transcriptomes. RESULTS: We modified the previous combination of SSH and microarray methods by directly using the subtracted amplicons as targets to hybridize to pre-made cDNA microarrays (named "SSH/microarray"). mRNA prepared from three pairs of hepatoma and non-hepatoma liver tissues was subjected to the SSH/microarray assays, as well as directly to regular cDNA microarray assays for comparison. Compared with the original SSH and microarray combination assays, the modified SSH/microarray assays allowed much easier inspection of the subtraction efficiency and identification of genes in the subtracted amplicons, without tedious and inefficient sequencing work. In addition, 5015 of the 9376 genes originally filtered out by the regular cDNA microarray assays because of low expression became analyzable by the SSH/microarray assays. Moreover, the SSH/microarray assays detected about ten times more (701 vs. 69) HCC differentially expressed genes (at least a two-fold difference and P < 0.01), particularly those with rare transcripts, than did the regular cDNA microarray assays. The differential expression was validated for 9 randomly selected genes in 18 pairs of hepatoma/non-hepatoma liver tissues using quantitative RT-PCR. The SSH/microarray approach identified many differentially expressed genes implicated in the regulation of cell cycle, cell death, signal transduction and cell morphogenesis, suggesting the involvement of multiple biological processes in hepatocarcinogenesis. CONCLUSION: The modified SSH/microarray approach is a simple but highly sensitive and efficient tool for differentially profiling low-expression transcriptomes. It is well suited to functional genomic studies
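The selection rule quoted in the abstract above (at least a two-fold difference and P < 0.01) can be sketched as a simple filter. This is only an illustration under stated assumptions: the function name `is_differential`, the gene labels, and all signal and P values are hypothetical, and the abstract does not say how the fold change or the P value was actually computed.

```python
import math


def is_differential(tumor, normal, p_value, min_fold=2.0, max_p=0.01):
    """Return True when the ratio is at least min_fold in either direction
    and the P value is below max_p (thresholds taken from the abstract)."""
    if tumor <= 0 or normal <= 0:
        raise ValueError("expression values must be positive")
    log2_ratio = math.log2(tumor / normal)
    return abs(log2_ratio) >= math.log2(min_fold) and p_value < max_p


# Hypothetical records: (gene label, hepatoma signal, non-hepatoma signal, P value)
records = [
    ("gene_a", 800.0, 200.0, 0.001),   # 4-fold up, significant
    ("gene_b", 150.0, 400.0, 0.004),   # about 2.7-fold down, significant
    ("gene_c", 300.0, 200.0, 0.0005),  # 1.5-fold, below the fold threshold
    ("gene_d", 900.0, 100.0, 0.05),    # 9-fold, but P is not below 0.01
]

selected = [name for name, t, n, p in records if is_differential(t, n, p)]
print(selected)
```

Working on the log2 scale treats up- and down-regulation symmetrically, so a gene at half the reference level passes the same threshold as one at double.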

    Methods for simultaneously identifying coherent local clusters with smooth global patterns in gene expression profiles

    BACKGROUND: The hierarchical clustering tree (HCT) with a dendrogram [1] and the singular value decomposition (SVD) with a dimension-reduced representative map [2] are popular methods for two-way sorting the gene-by-array matrix map employed in gene expression profiling. While HCT dendrograms tend to optimize local coherent clustering patterns, SVD leading eigenvectors usually identify better global grouping and transitional structures. RESULTS: This study proposes a flipping mechanism for a conventional agglomerative HCT using a rank-two ellipse (R2E, an improved SVD algorithm for sorting purpose) seriation by Chen [3] as an external reference. While HCTs always produce permutations with good local behaviour, the rank-two ellipse seriation gives the best global grouping patterns and smooth transitional trends. The resulting algorithm automatically integrates the desirable properties of each method so that users have access to a clustering and visualization environment for gene expression profiles that preserves coherent local clusters and identifies global grouping trends. CONCLUSION: We demonstrate, through four examples, that the proposed method not only possesses better numerical and statistical properties, it also provides more meaningful biomedical insights than other sorting algorithms. We suggest that sorted proximity matrices for genes and arrays, in addition to the gene-by-array expression matrix, can greatly aid in the search for comprehensive understanding of gene expression structures. Software for the proposed methods can be obtained at http://gap.stat.sinica.edu.tw/Software/GAP.

    Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3

    Microarray meta-analysis database (M2DB): a uniformly pre-processed, quality controlled, and manually curated human clinical microarray database

    BACKGROUND: Over the past decade, gene expression microarray studies have greatly expanded our knowledge of genetic mechanisms of human diseases. Meta-analysis of substantial amounts of accumulated data, by integrating valuable information from multiple studies, is becoming more important in microarray research. However, collecting data of special interest from public microarray repositories often presents major practical problems. Moreover, including low-quality data may significantly reduce meta-analysis efficiency. RESULTS: M2DB is a human curated microarray database designed for easy querying, based on clinical information, and for interactive retrieval of either raw or uniformly pre-processed data, along with a set of quality-control metrics. The database contains more than 10,000 previously published Affymetrix GeneChip arrays, performed using human clinical specimens. M2DB allows online querying according to a flexible combination of five clinical annotations describing disease state and sampling location. These annotations were manually curated by controlled vocabularies, based on information obtained from GEO, ArrayExpress, and published papers. For array-based assessment control, the online query provides sets of QC metrics, generated using three available QC algorithms. Arrays with poor data quality can easily be excluded from the query interface. The query provides values from two algorithms for gene-based filtering, and raw data and three kinds of pre-processed data for downloading. CONCLUSION: M2DB utilizes a user-friendly interface for QC parameters, sample clinical annotations, and data formats to help users obtain clinical metadata. This database provides a lower entry threshold and an integrated process of meta-analysis. We hope that this research will promote further evolution of microarray meta-analysis.

    Molecular signature of clinical severity in recovering patients with severe acute respiratory syndrome coronavirus (SARS-CoV)

    BACKGROUND: Severe acute respiratory syndrome (SARS), a recent epidemic human disease, is caused by a novel coronavirus (SARS-CoV). First reported in Asia, SARS quickly spread worldwide through international travelling. As of July 2003, the World Health Organization reported a total of 8,437 people afflicted with SARS with a 9.6% mortality rate. Although immunopathological damages may account for the severity of respiratory distress, little is known about how the genome-wide gene expression of the host changes under the attack of SARS-CoV. RESULTS: Based on changes in gene expression of peripheral blood, we identified 52 signature genes that accurately discriminated acute SARS patients from non-SARS controls. While a general suppression of gene expression predominated in SARS-infected blood, several genes including those involved in innate immunity, such as defensins and eosinophil-derived neurotoxin, were upregulated. Instead of employing clustering methods, we ranked the severity of recovering SARS patients by generalized associate plots (GAP) according to the expression profiles of 52 signature genes. Through this method, we discovered a smooth transition pattern of severity from normal controls to acute SARS patients. The rank of SARS severity was significantly correlated with the recovery period (in days) and with the clinical pulmonary infection score. CONCLUSION: The use of the GAP approach has proved useful in analyzing the complexity and continuity of biological systems. The severity rank derived from the global expression profile of significantly regulated genes in patients may be useful for further elucidating the pathophysiology of their disease

    Microarray labeling extension values: laboratory signatures for Affymetrix GeneChips

    Interlaboratory comparison of microarray data, even when using the same platform, poses several challenges for scientists. RNA quality, RNA labeling efficiency, hybridization procedures and data-mining tools can all contribute variations in each laboratory. In Affymetrix GeneChips, about 11–20 different 25-mer oligonucleotides are used to measure the level of each transcript. Here, we report that ‘labeling extension values (LEVs)’, which are correlation coefficients between probe intensities and probe positions, are highly correlated with the gene expression levels (GEVs) on eukaryotic Affymetrix microarray data. By analyzing LEVs and GEVs in the publicly available 2414 cel files of 20 Affymetrix microarray types covering 13 species, we found that correlations between LEVs and GEVs exist only in eukaryotic RNAs, not in prokaryotic ones. Surprisingly, Affymetrix results of the same specimens that were analyzed in different laboratories could be clearly differentiated only by LEVs, leading to the identification of ‘laboratory signatures’. In the examined dataset, GSE10797, filtering out high-LEV genes did not compromise the discovery of biological processes that are constructed by differentially expressed genes. In conclusion, LEVs provide a new filtering parameter for microarray analysis of gene expression and may improve the inter- and intralaboratory comparability of Affymetrix GeneChips data
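The abstract above defines an LEV as a correlation coefficient between probe intensities and probe positions within a probe set. A minimal sketch of that idea follows, assuming a Pearson correlation on raw intensities and probe positions given as simple ordinal indices. The abstract specifies neither choice, so both are assumptions, as are the function names and the example values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def labeling_extension_value(positions, intensities):
    """Hypothetical LEV: correlation between probe position and probe intensity
    for one probe set (Pearson on raw values is an assumption)."""
    return pearson(positions, intensities)


# Hypothetical probe set of 11 probes whose signal rises steadily with position.
positions = list(range(1, 12))
intensities = [100.0 + 25.0 * p for p in positions]
lev = labeling_extension_value(positions, intensities)
print(round(lev, 6))
```

A probe set whose intensity does not trend with position would give a value near zero, while a strong positional trend gives a value near +1 or -1.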

    Analysis of human meiotic recombination events with a parent-sibling tracing approach

    BACKGROUND: Meiotic recombination ensures that each child inherits distinct genetic materials from each parent, but the distribution of crossovers along meiotic chromosomes remains difficult to identify. In this study, we developed a parent-sibling tracing (PST) approach from previously reported methods to identify meiotic crossover sites in the GEO GSE6754 data set. This approach requires only the single nucleotide polymorphism (SNP) data of the pedigrees of both parents and at least two children. RESULTS: Compared to other SNP-based algorithms (identity by descent or pediSNP), fewer uninformative SNPs were derived with the use of PST. Analysis of a GEO GSE6754 data set containing 2,145 maternal and paternal meiotic events revealed that the pattern and distribution of paternal and maternal recombination sites vary along the chromosomes. Lower crossover rates near the centromeres were more prominent in males than in females. Based on analysis of repetitive sequences, we also showed that recombination hotspots are positively correlated with SINE/MIR repetitive elements and negatively correlated with LINE/L1 elements. The number of meiotic recombination events was positively correlated with the number of shorter tandem repeat sequences. CONCLUSIONS: The advantages of the PST approach include the ability to use only two-generation pedigrees with two siblings and the ability to perform gender-specific analyses of repetitive elements and tandem repeat sequences while including fewer uninformative SNP regions in the results.

    Global Analyses of Small Interfering RNAs Derived from Bamboo mosaic virus and Its Associated Satellite RNAs in Different Plants

    Background: Satellite RNAs (satRNAs), virus parasites, are exclusively associated with plant virus infection and have attracted much interest over the last 3 decades. Upon virus infection, virus-specific small interfering RNAs (vsiRNAs) are produced by dicer-like (DCL) endoribonucleases for anti-viral defense. The composition of vsiRNAs has been studied extensively; however, studies of satRNA-derived siRNAs (satsiRNAs) or siRNA profiles after satRNA co-infection are limited. Here, we report on the small RNA profiles associated with infection with Bamboo mosaic virus (BaMV) and its two satellite RNAs (satBaMVs) in Nicotiana benthamiana and Arabidopsis thaliana. Methodology/Principal Findings: Leaves of N. benthamiana or A. thaliana inoculated with water, BaMV alone or co-inoculated with interfering or noninterfering satBaMV were collected for RNA extraction, followed by large-scale Solexa sequencing. BaMV-specific siRNAs accumulated to about 20% of total siRNAs in highly susceptible N. benthamiana leaves inoculated with BaMV alone or co-inoculated with noninterfering satBaMV; however, only about 0.1% of vsiRNAs were produced in plants co-infected with interfering satBaMV. The abundant region of siRNA distribution along BaMV and satBaMV genomes differed by host but not by co-infection with satBaMV. Most of the BaMV and satBaMV siRNAs were 21 or 22 nt, of both (+) and (-) polarities; however, a higher proportion of 22-nt BaMV and satBaMV siRNAs were generated in N. benthamiana than in A. thaliana. Furthermore, the proportion of non-viral 24-nt siRNAs was greatly increased in N. benthamiana after virus infection. Conclusions/Significance: The overall composition of vsiRNAs and satsiRNAs in the infected plants reflects the combined action of virus, satRNA and different DCLs in host plants. Our findings suggest that the structure and/or sequence demands of various DCLs in different hosts may result in differential susceptibility to the same virus. DCL2 producing 24-nt siRNAs under biotic stresses may play a vital role in the antiviral mechanism in N. benthamiana
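The size-class proportions reported above (21-, 22- and 24-nt siRNAs) come down to tallying read lengths. The sketch below shows that tally only: the function name `length_proportions` and the placeholder reads are hypothetical, and it does not reproduce the study's mapping or filtering steps.

```python
from collections import Counter


def length_proportions(reads):
    """Return the fraction of reads at each length, keyed by length in nt."""
    counts = Counter(len(read) for read in reads)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return {length: counts[length] / total for length in sorted(counts)}


# Hypothetical small-RNA reads; the sequences are placeholders of the stated length.
reads = ["A" * 21, "C" * 21, "G" * 22, "U" * 24]
proportions = length_proportions(reads)
print(proportions)
```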

    Targeted next-generation sequencing for the detection of cancer-associated somatic mutations in adenomyosis

    Adenomyosis is a condition characterised by the invasion of endometrial tissues into the uterine myometrium, the molecular pathogenesis of which remains incompletely elucidated. Lesion profiling with next-generation sequencing (NGS) can lead to the identification of previously unanticipated causative genes and the detection of therapeutically actionable genetic changes. Using an NGS panel that included 275 cancer susceptibility genes, this study examined the occurrence and frequency of somatic mutations in adenomyotic tissue specimens collected from 17 women. Extracted DNA was enriched using targeted formalin-fixed paraffin-embedded tissue cores prior to the identification of lesion-specific variants. The results revealed that KRAS and AT-rich interactive domain 1A (ARID1A) were the two most frequently mutated genes (mutation frequencies: 24% and 12%, respectively). Notably, endometrial atypical hyperplasia did not involve adenomyotic areas. We also identified, for the first time, two potentially pathogenic mutations in the F-box/WD repeat-containing protein 7 (FBXW7) and cohesin subunit SA-2 (STAG2) genes. These findings indicate that mutations in the KRAS, ARID1A, FBXW7 and STAG2 genes may play a critical role in the pathogenesis of adenomyosis. Additional studies are needed to assess whether the utilisation of oncogenic driver mutations can inform the surveillance of patients with adenomyosis who had not undergone hysterectomy.
    Impact statement. What is already known on this subject? Although somatic point mutations in the KRAS oncogene have been recently detected in adenomyosis, the molecular underpinnings of this condition remain incompletely elucidated. Lesion profiling with next-generation sequencing (NGS) can lead to the identification of previously unanticipated causative genes and the detection of therapeutically actionable genetic changes. What do the results of this study add? The results of NGS revealed that KRAS and AT-rich interactive domain 1A (ARID1A) were the two most frequently mutated genes (mutation frequencies: 24% and 12%, respectively). We also identified, for the first time, two potentially pathogenic mutations in the F-box/WD repeat-containing protein 7 (FBXW7) and cohesin subunit SA-2 (STAG2) genes. What are the implications of these findings for clinical practice and/or further research? The utilisation of oncogenic driver mutations has the potential to inform the surveillance of patients with adenomyosis who had not undergone hysterectomy